Search CORE

Enhanced Failure Detection Mechanism in MapReduce

Author: Antoniu Gabriel
Memishi Bunjamin
Pérez Hernández María de los Santos
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012
Field of study

The popularity of MapReduce programming model has increased interest in the research community for its improvement. Among the other directions, the point of fault tolerance, concretely the failure detection issue seems to be a crucial one, but that until now has not reached its satisfying level. Motivated by this, I decided to devote my main research during this period into having a prototype system architecture of MapReduce framework with a new failure detection service, containing both analytical (theoretical) and implementation part. I am confident that this work should lead the way for further contributions in detecting failures to any NoSQL App frameworks, and cloud storage systems in general

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Arquitectura multiagente para E/S de alto rendimiento en clusters

Author: Pérez Hernández María de los Santos
Publication venue: Facultad de Informática (UPM)
Publication date: 01/01/2003
Field of study

La E/S constituye en la actualidad uno de los principales cuellos de botella de los sistemas distribuidos de propósito general, debido al desequilibrio existente entre el tiempo de cómputo y de E/S. Una de las soluciones propuestas para este problema ha sido el uso de la E/S paralela. En esta área, se han originado un gran número de bibliotecas de E/S paralela y sistemas de ficheros paralelos. Este tipo de sistemas adolecen de algunos defectos y carencias. Muchos de ellos están concebidos para máquinas paralelas y no se integran adecuadamente en entornos distribuidos y clusters. El uso intensivo de clusters de estaciones de trabajo durante estos últimos años hace que este tipo de sistemas no sean adecuados en el escenario de computación actual. Otros sistemas, que se adaptan a este tipo de entornos, no incluyen capacidades de reconfiguración dinámica, por lo que tienen una funcionalidad limitada. Por último, la mayoría de los sistemas de E/S que utilizan diferentes optimizaciones de E/S, no ofrecen flexibilidad a las aplicaciones para hacer uso de las mismas, intentando ocultar al usuario este tipo de técnicas. No obstante, a fin de optimizar las operaciones de E/S, es importante que las aplicaciones sean capaces de describir sus patrones de acceso, interactuando con el sistema de E/S. En otro ámbito, dentro del área de los sistemas distribuidos se encuentra el paradigma de agentes, que permite dotar a las aplicaciones de un conjunto de propiedades muy adecuadas para su adaptación a entornos complejos y dinámicos. Las características de este paradigma lo hacen a priori prometedor para abordar algunos de los problemas existentes en el campo de la E/S paralela. Esta tesis propone una solución a la problemática actual de E/S a través de tres líneas principales: (i) el uso de la teoría de agentes en sistemas de E/S de alto rendimiento, (ii) la definición de un formalismo que permita la reconfiguración dinámica de nodos de almacenamiento en un cluster y (iii) el uso de técnicas de optimización de E/S configurables y orientadas a las aplicaciones

Archivio Ricerca Ca'Foscari

Parallel and Distributed Data Management. Introduction

Author: Antoniu Gabriel
Ghoting Amol
Orlando S.
Pérez Hernández María de los Santos
Publication venue: Facultad de Informática (UPM)
Publication date: 01/01/2011
Field of study

The manipulation and handling of an ever increasing volume of data by current data-intensive applications require novel techniques for e?cient data management. Despite recent advances in every aspect of data management (storage, access, querying, analysis, mining), future applications are expected to scale to even higher degrees, not only in terms of volumes of data handled but also in terms of users and resources, often making use of multiple, pre-existing autonomous, distributed or heterogeneous resources

Archivio istituzionale della ricerca - Università degli Studi di Venezia Ca' Foscari

Harmony: Towards automated self-adaptive consistency in cloud storage

Author: Antoniu Gabriel
Chihoub Houssem-Eddine
Ibrahim Shadi
Pérez Hernández María de los Santos
Publication venue: Facultad de Informática (UPM)
Publication date: 01/01/2012
Field of study

In just a few years cloud computing has become a very popular paradigm and a business success story, with storage being one of the key features. To achieve high data availability, cloud storage services rely on replication. In this context, one major challenge is data consistency. In contrast to traditional approaches that are mostly based on strong consistency, many cloud storage services opt for weaker consistency models in order to achieve better availability and performance. This comes at the cost of a high probability of stale data being read, as the replicas involved in the reads may not always have the most recent write. In this paper, we propose a novel approach, named Harmony, which adaptively tunes the consistency level at run-time according to the application requirements. The key idea behind Harmony is an intelligent estimation model of stale reads, allowing to elastically scale up or down the number of replicas involved in read operations to maintain a low (possibly zero) tolerable fraction of stale reads. As a result, Harmony can meet the desired consistency of the applications while achieving good performance. We have implemented Harmony and performed extensive evaluations with the Cassandra cloud storage on Grid?5000 testbed and on Amazon EC2. The results show that Harmony can achieve good performance without exceeding the tolerated number of stale reads. For instance, in contrast to the static eventual consistency used in Cassandra, Harmony reduces the stale data being read by almost 80% while adding only minimal latency. Meanwhile, it improves the throughput of the system by 45% while maintaining the desired consistency requirements of the applications when compared to the strong consistency model in Cassandra

Crossref

Using Global Behavior Modeling to Improve QoS in Cloud Data Storage Services

Author: Antoniu Gabriel
Montes Jesús
Nicolae Bogdan
Pérez Hernández María de los Santos
Sánchez Alberto
Publication venue: E.U. de Informática (UPM)
Publication date: 01/02/2011
Field of study

The cloud computing model aims to make large-scale data-intensive computing affordable even for users with limited financial resources, that cannot invest into expensive infrastructures necesssary to run them. In this context, MapReduce is emerging as a highly scalable programming paradigm that enables high-throughput data-intensive processing as a cloud service. Its performance is highly dependent on the underlying storage service, responsible to efficiently support massively parallel data accesses by guaranteeing a high throughput under heavy access concurrency. In this context, quality of service plays a crucial role: the storage service needs to sustain a stable throughput for each individual accesss, in addition to achieving a high aggregated throughput under concurrency. In this paper we propose a technique to address this problem using component monitoring, application-side feedback and behavior pattern analysis to automatically infer useful knowledge about the causes of poor quality of service and provide an easy way to reason in about potential improvements. We apply our proposal to Blob Seer, a representative data storage service specifically designed to achieve high aggregated throughputs and show through extensive experimentation substantial improvements in the stability of individual data read accesses under MapReduce workloads

GMonE: a complete approach to cloud monitoring

Author: Antoniu Gabriel
Memishi Bunjamin
Montes Jesús
Pérez Hernández María de los Santos
Sánchez Alberto
Publication venue: 'Elsevier BV'
Publication date: 01/01/2013
Field of study

The inherent complexity of modern cloud infrastructures has created the need for innovative monitoring approaches, as state-of-the-art solutions used for other large-scale environments do not address specific cloud features. Although cloud monitoring is nowadays an active research field, a comprehensive study covering all its aspects has not been presented yet. This paper provides a deep insight into cloud monitoring. It proposes a unified cloud monitoring taxonomy, based on which it defines a layered cloud monitoring architecture. To illustrate it, we have implemented GMonE, a general-purpose cloud monitoring tool which covers all aspects of cloud monitoring by specifically addressing the needs of modern cloud infrastructures. Furthermore, we have evaluated the performance, scalability and overhead of GMonE with Yahoo Cloud Serving Benchmark (YCSB), by using the OpenNebula cloud middleware on the Grid’5000 experimental testbed. The results of this evaluation demonstrate the benefits of our approach, surpassing the monitoring performance and capabilities of cloud monitoring alternatives such as those present in state-of-the-art systems such as Amazon EC2 and OpenNebula

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

An autonomic framework for enhancing the quality of data grid services

Author: Cortes Rosello Antonio
Montes Jesús
Pérez Hernández María de los Santos
Sánchez Campos Alberto
Publication venue: 'Elsevier BV'
Publication date: 01/01/2012
Field of study

Data grid services have been used to deal with the increasing needs of applications in terms of data volume and throughput. The large scale, heterogeneity and dynamism of grid environments often make management and tuning of these data services very complex. Furthermore, current high-performance I/O approaches are characterized by their high complexity and specific features that usually require specialized administrator skills. Autonomic computing can help manage this complexity. The present paper describes an autonomic subsystem intended to provide self-management features aimed at efficiently reducing the I/O problem in a grid environment, thereby enhancing the quality of service (QoS) of data access and storage services in the grid. Our proposal takes into account that data produced in an I/O system is not usually immediately required. Therefore, performance improvements are related not only to current but also to any future I/O access, as the actual data access usually occurs later on. Nevertheless, the exact time of the next I/O operations is unknown. Thus, our approach proposes a long-term prediction designed to forecast the future workload of grid components. This enables the autonomic subsystem to determine the optimal data placement to improve both current and future I/O operations

Towards efficient localization of dynamic replicas for Geo-Distributed data stores

Author: Antoniu Gabriel
Costan Alexandru
Matri Pierre
Montes Sánchez Jesús
Pérez Hernández María de los Santos
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016
Field of study

Large-scale scientific experiments increasingly rely on geo- distributed clouds to serve relevant data to scientists world- wide with minimal latency. State-of-the-art caching systems often require the client to access the data through a caching proxy, or to contact a metadata server to locate the closest available copy of the desired data. Also, such caching sys- tems are inconsistent with the design of distributed hash- table databases such as Dynamo, which focus on allowing clients to locate data independently. We argue there is a gap between existing state-of-the-art solutions and the needs of geographically distributed applications, which require fast access to popular objects while not degrading access latency for the rest of the data. In this paper, we introduce a proba- bilistic algorithm allowing the user to locate the closest copy of the data e?ciently and independently with minimal over- head, allowing low-latency access to non-cached data. Also, we propose a network-e?cient technique to identify the most popular data objects in the cluster and trigger their replica- tion close to the clients. Experiments with a real-world data set show that these principles allow clients to locate the clos- est available copy of data with small memory footprint and low error-rate, thus improving read-latency for non-cached data and allowing hot data to be read locally